Team, Visitors, External Collaborators
Overall Objectives
Research Program
Application Domains
Highlights of the Year
New Software and Platforms
New Results
Bilateral Contracts and Grants with Industry
Partnerships and Cooperations
Dissemination
Bibliography
XML PDF e-pub
PDF e-Pub


Section: Research Program

Multivariate decompositions

Multivariate decompositions provide a way to model complex data such as brain activation images: for instance, one might be interested in extracting an atlas of brain regions from a given dataset, such as regions exhibiting similar activity during a protocol, across multiple protocols, or even in the absence of protocol (during resting-state). These data can often be factorized into spatial-temporal components, and thus can be estimated through regularized Principal Components Analysis (PCA) algorithms, which share some common steps with regularized regression.

Let 𝐗 be a neuroimaging dataset written as an (nsubjects,nvoxels) matrix, after proper centering; the model reads

where 𝐃 represents a set of ncomp spatial maps, hence a matrix of shape (ncomp,nvoxels), and 𝐀 the associated subject-wise loadings. While traditional PCA and independent components analysis (ICA) are limited to reconstructing components 𝐃 within the space spanned by the column of 𝐗, it seems desirable to add some constraints on the rows of 𝐃, that represent spatial maps, such as sparsity, and/or smoothness, as it makes the interpretation of these maps clearer in the context of neuroimaging. This yields the following estimation problem:

where (𝐀i),i{1..nfeatures} represents the columns of 𝐀. Ψ can be chosen such as in Eq. (2) in order to enforce smoothness and/or sparsity constraints.

The problem is not jointly convex in all the variables but each penalization given in Eq (2) yields a convex problem on 𝐃 for 𝐀 fixed, and conversely. This readily suggests an alternate optimization scheme, where 𝐃 and 𝐀 are estimated in turn, until convergence to a local optimum of the criterion. As in PCA, the extracted components can be ranked according to the amount of fitted variance. Importantly, also, estimated PCA models can be interpreted as a probabilistic model of the data, assuming a high-dimensional Gaussian distribution (probabilistic PCA).

Ultimately, the main limitations to these algorithms is the cost due to the memory requirements: holding datasets with large dimension and large number of samples (as in recent neuroimaging cohorts) leads to inefficient computation. To solve this issue, online methods are particularly attractive [24].